Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Issue 2563: Fix SST file corruption. #2564

Merged
merged 1 commit into from Jan 29, 2021

Conversation

sursingh
Copy link
Contributor

Fix SST File corruption during checkpointing

Motivation

Since the SST files are shared among checkpoints, this will not be resolved by future checkpoints. We will fail to restore all future checkpoints that depend on this file.

Changes

The record is sent asynchronously. We need to use a copy of the passed buffer
in the record. The ownership is retained by the caller and will be potentially
changed by the caller. In case of corruption the later blocks were
overwriting the previous blocks resulting in corruption

Master Issue: #2563


In order to uphold a high standard for quality for code contributions, Apache BookKeeper runs various precommit
checks for pull requests. A pull request can only be merged when it passes precommit checks.


Be sure to do all of the following to help us incorporate your contribution
quickly and easily:

If this PR is a BookKeeper Proposal (BP):

  • Make sure the PR title is formatted like:
    <BP-#>: Description of bookkeeper proposal
    e.g. BP-1: 64 bits ledger is support
  • Attach the master issue link in the description of this PR.
  • Attach the google doc link if the BP is written in Google Doc.

Otherwise:

  • Make sure the PR title is formatted like:
    <Issue #>: Description of pull request
    e.g. Issue 123: Description ...
  • Make sure tests pass via mvn clean apache-rat:check install spotbugs:check.
  • Replace <Issue #> in the title with the actual Issue number.

The record is sent asynchronously. We need to use a copy of the passed buffer
in the record. The ownership is retained by the caller and will be potentially
changed by the caller. In case of corruption the later blocks were
overwritting the previous blocks resulting in corruption
@merlimat merlimat added this to the 4.13.0 milestone Jan 29, 2021
@dlg99 dlg99 self-requested a review January 29, 2021 18:30
Copy link
Contributor

@dlg99 dlg99 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Makes sense. Thank you!

Copy link
Contributor

@eolivelli eolivelli left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Lgtm

@eolivelli
Copy link
Contributor

@dlg99 we are using the dev/bk-merge-pr.py script in order to merge and cherry pick.

Feel free to commit as soon as CI passes.

We should cherry pick to branch-4.12 branch-4.11 and branch-4.10

@dlg99 dlg99 added area/tableservice changes related to table service release/4.13.0 labels Jan 29, 2021
@dlg99 dlg99 merged commit efaa993 into apache:master Jan 29, 2021
dlg99 pushed a commit that referenced this pull request Feb 2, 2021
Fix SST File corruption during checkpointing

### Motivation

Since the SST files are shared among checkpoints, this will not be resolved by future checkpoints. We will fail to restore all future checkpoints that depend on this file.

### Changes

The record is sent asynchronously. We need to use a copy of the passed buffer
in the record. The ownership is retained by the caller and will be potentially
changed by the caller. In case of corruption the later blocks were
overwriting the previous blocks resulting in corruption


Master Issue: #2563 



Reviewers: Andrey Yegorov <None>, Enrico Olivelli <eolivelli@gmail.com>, Matteo Merli <mmerli@apache.org>

This closes #2564 from sursingh/fix-sst-corruption, closes #2563
dlg99 pushed a commit that referenced this pull request Feb 2, 2021
Fix SST File corruption during checkpointing

### Motivation

Since the SST files are shared among checkpoints, this will not be resolved by future checkpoints. We will fail to restore all future checkpoints that depend on this file.

### Changes

The record is sent asynchronously. We need to use a copy of the passed buffer
in the record. The ownership is retained by the caller and will be potentially
changed by the caller. In case of corruption the later blocks were
overwriting the previous blocks resulting in corruption


Master Issue: #2563 



Reviewers: Andrey Yegorov <None>, Enrico Olivelli <eolivelli@gmail.com>, Matteo Merli <mmerli@apache.org>

This closes #2564 from sursingh/fix-sst-corruption, closes #2563
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants